---
title: 'Embedding Models served by DeepInfra'
sidebarTitle: 'DeepInfra'
---

## Availble models

Fireworks provide a larger variety of models, here's the one we use in our example. You can replace it with any
embedding model supported by DeepInfra:

| Model              | Dimensions | Max Tokens | Cost               | MTEB Avg Score | Similarity Metric |
| ------------------ | ---------- | ---------- | ------------------ | -------------- | ----------------- |
| thenlper/gte-large | 1024       | 512        | $0.010 / 1M tokens | 63.23          | cosine            |

## Usage

Note that DeepInfra doesn't have an SDK. Their documentation shows the use of the REST API directly with a
client library in your language of choice, or you can use OpenAI's SDK - DeepInfra is compatible with OpenAI's API.

In the examples below, we use OpenAI's SDK with DeepInfra URL, API key and models.

### Installing dependencies

```bash
npm install @niledatabase/server openai
```

### Generating embeddings with DeepInfra

```javascript
// set up OpenAI SDK with Fireworks configuration
const { OpenAI } = await import('openai');
const DEEPINFRA_API_KEY = 'your deepinfra api key';

const openai = new OpenAI({
  apiKey: DEEPINFRA_API_KEY,
  baseURL: 'https://api.deepinfra.com/v1/openai',
});

const MODEL = 'thenlper/gte-large'; // or any other model supported by DeepInfra
const input_text = `The future belongs to those who believe in the beauty of their 
                    dreams.`;

// generate embeddings
let resp = await openai.embeddings.create({
  model: MODEL,
  input: input_text,
});

// OpenAI's response is an object with an array of
// objects that contain the vector embeddings
let embedding = resp.data[0].embedding;
```

### Storing and retrieving the embeddings

```javascript
// set up Nile as the vector store
const { Nile } = await import('@niledatabase/server');
const NILEDB_USER = 'you need a nile user with access to a nile database';
const NILEDB_PASSWORD = 'and a password for that user';
const nile = Nile({
  user: NILEDB_USER,
  password: NILEDB_PASSWORD,
});

// store vector in a table
await nile.db.query('INSERT INTO embeddings (embedding) values ($1)', [
  JSON.stringify(embedding.map((v) => Number(v))),
]);

// search for similar vectors
let db_resp = await nile.db.query(
  'select embedding from embeddings order by embedding<=>$1 limit 1',
  [JSON.stringify(embedding.map((v) => Number(v)))],
);

// Postgres returns an object, with array of rows each column is a
// property of the row the vector is represented as a string
let similar_str = db_resp.rows[0].embedding;
let similar = similar_str
  .substring(1, similar_str.length - 1)
  .split(',')
  .map((v) => parseFloat(v));

// check that we got the same vector back
let same = true;

for (let i = 0; i < similar.length; i++) {
  if (Math.abs(similar[i] - embedding[i]) > 0.000001) {
    same = false;
  }
}

console.log('got same vector? ' + same);
```

## Additional information

### Reducing dimensions

Using larger embeddings generally costs more and consumes more compute, memory and storage than using smaller embeddings. This is especially true for embeddings stored with `pg_vector`.
When storing embeddings in Postgres, it is important that each vector will be stored in a row that fits in a single PG block (typically 8K). If this size is exceeded,
the vector will be stored in TOAST storage which can slow down queries. In addition vectors that are "TOASTed" are not indexed, which means you can't reliably use vector indexes.

Fireworks supports multiple models. `gte-large` and `nomic-embed-text-v1.5` are two of the models available.
The `gte-large` model has 1024 dimensions and does not support scaling down. The `nomic-embed-text-v1.5` model has
768 dimensions and can scale down to 512, 256, 128 and 64.
